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ABSTRACT 



The CLEAN algorithm, widely used in radio interferometry for the deconvolution of radio images, performs well only if the raw radio 

image (dirty image) is, to good approximation, a simple convolution between the instrumental point-spread function (dirty beam) 

and the true distribution of emission across the sky. An important case in which this approximation breaks down is during frequency 

04 . synthesis if the observing bandwidth is wide enough for variations in the spectrum of the sky to become significant. The convolution 

assumption also breaks down, in any situation but snapshot observations, if sources in the field vary significantly in flux density over 

the duration of the observation. Such time-variation can even be instrumental in nature, for example due to jitter or rotation of the 

primary beam pattern on the sky during an observation. An algorithm already exists for dealing with the spectral variation encountered 

in wide-band frequency synthesis interferometry. This algorithm is an extension of CLEAN in which, at each iteration, a set ofN 'dirty 

beams' are fitted and subtracted in parallel, instead of just a single dirty beam as in standard CLEAN. In the wide-band algorithm the 

beams are obtained by expanding a nominal source spectrum in a Taylor series, each term of the series generating one of the beams. 

In the present paper this algorithm is extended to images which contain sources which vary over both frequency and time. Different 

Q . expansion schemes (or bases) on the time and frequency axes are compared, and issues such as Gibbs ringing and non-orthogonality 

$—( ' are discussed. It is shown that practical considerations make it often desirable to orthogonalize the set of beams before commencing 

TTt. ' the cleaning. This is easily accomplished via a Gram-Schmidt technique. 
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J^ 1. Introduction by the greater propensity of compact, extragalactic sources to 

^;^ scintillate at low galactic latitudes). It is even more difficult to 

OO . It has been known for some decades that some radio sources vary ^^^^^^^ ^j^^ ^^^^^-^^ ^f ^^^-^^^^ ^^^^^^^ ^j^i^j^ ^^ ^^y ^-^^^ 

Csj ■ m intensity over time (Dent I965p. The range of time scales de- ^^ ^^ ^^^ ^ j^ ^^^ ^^^^^^^ necessary to determine this 

^. tected so far, roughly from minutes to decades, is set by the prac- ^^^^^^^ ^^^^^^ extravagant amounts of observing time. 

t — ticalities of measurement rather than by any intrinsic properties ^. , ., , , , .^ ,,■ . i -„„.i 

0-, ,.. 1 »• -ru / • J vu Of presently-available survey results, De Vries et al (1200411, 

of the source populations. The present paper IS concerned with . ^ .-^ . ^,.,,.,^,,^ \ f 

_j ' u- u uu-^ • £ » ,.■ ■ a A ■. in a two-epoch comparison of a high latitude held, found only 

^^ . sources which exhibit signmcant variation in flux density on „ ^ ,_-),- J; ,, r, . • • ,- -, . r, , 

__i , ■■ , J,, ^u "lA u I 11 J T * n. -i? ■ ui ~ 0.2 deg - at 1.4 GHz flux densities of > 2 mJy. Bower et al 

'^ ' time scales of less than 24 hours (so-called Intra-Day Variables ,-^^_, , ° , , ,- , ,t » / . ^ ^ ^t, 

>• • . xrN,, N J u- u 1 ■ » J -^u » J J » » 02OO7D have processed a large amount of VLA 5 and 8.4 GHz 
or IDVs) and which are also associated with extended structure ^ — r-r' ,,,^, .,,••, ,-,^,_9 
,' f ■ 1 ui u ^ A A- ■ ^ t » -ru archival data and hnd a snapshot incidence of 1.5 deg ra- 

•^ of a size resolvable by present-day radio interferometers. The ,, . r, , ■ • , ^^^ . t, , 

K> . . . , -^ f \ ^^ ■ u \^ ^^i, A dio transients at flux densities greater than 350 /iJy. Becker et al 

Y^ structure makes It of interest to image such sources, but the rapid ^„,„ ,, , ,-?,,«, • ,- . , • 

lJ ' • U-1-* u- J *» * » J T» • *u A-a^ u- • (2010) compared 3 epochs of VLA observations of the galactic 

ij , variability can hinder attempts to do so. It is these dimculties in , ^ ^ , ^ , , ^ , _9 , , , , ^^ 

C^ ., • . ^ ... • e in\7 I.- I. ^i. » plane at 6 cm and found ~ 1 .6 deg sources between 1 and 100 

■ - - I the interferometnc imaging of IDV sources which the present ^ ^ ,.,,, ,,• ,, , ^^^ t^ 

• ■ • ,. ■■ mJy which had a modulation depth greater than 50%. Regarding 

paper IS designed to address. , •',„,, ^ . ,, , . ^, , , j ^„„,. . ^ ° , .° 

,, , ^, jru.. n* the IDV fraction, Kedziora-Chudczer et al (1200 ID found, in 

How many such sources are there, and of what types.' Any , . , . ^ , ,, , . , „ ^ V^ir n i 

„;:,.., r, •,.,•, ,u J- their multi-frequency study (between 1.4 and 8.6 GHz), that 

estimation of the incidence of variability among the radio source , ■ ,.c ^ i • i -■ t^t t i • i n ^ • 

, ,. . V . J u .u ^ A CCA ■ about half of their 13 BL Lac obiects, and a smaller fraction 

population IS complicated by the extra degrees of freedom im- ^ ■ •, • ■ tt^, , i i ^^ i i • 

^,.^. . a .1- u. T. • 1 • •. ui .u . .u n of Quasars, exhibited IDV up to about 10% modulation; more re- 

plicit in a non-flat light curve. It IS also inevitable that there will ^, . ' , , i ^„„ni • i ■ ^ 

r , ,. cc . A . .1, ■ U-1-. f.u • . ..A centiv Lovell et al ( I2008I I. in a more comprehensive survey at 5 

be selection ettects due to the inability of the instrument to de- „,, , ^^ — "f. ■ . .^ a i i 

. , J 1 .• J .u u 1 ^ ■ ^ cc .u • •.• •* GHz, report that 37% of their 443 flat-spectrum sources showed 

tect modulation depths below a certain cutott, or the insensitivity . ' ^ ■ , .,. ^ , • , 

„ ^, , . . ^ . , ., ^ ■ signmcant variability on a 2-day time scale, 

of the observing regime to time scales outside a certain range. '^ ■' ■' 

The diff^erence between the population of objects observable in Sources which exhibit a combination of IDV plus some re- 

our galaxy and those which are extra-galactic introduces a de- solved structure fall into several classes, which are briefly re- 

pendence of incidence on galactic latitude (further complicated viewed in the following paragraphs. 

Novae are expected to exhibit significant changes in radio 

Send offprint requests to: I. M. Stewart flux density at intraday time scales. In the standard model, the 
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flux density from an expanding isothermal shell is expected ini- 
tially to increase proportional to the square of time since the 
outburst (Hjellming 1996), although some recent VLBA obser- 
vations appear to be inconsistent with that picture (Krauss et al 
1201 11 1. However, most novae at least in the past decade have not 
been resolved by radio interferometry until of order 100 days 
after the outburst, by which time the fractional change in flux 
density has fallen to about a percent or two per day (see e.g. 
Eyres et al l2000l Eyres et al 2005 Heywood & O'Brien l2007l l. 
A recent exception is the 2006 outburst of the recurrent nova 
RS Ophiuchi (O'Brien et al 2006a). This was observed with the 
VLBA first on day 14 after the outburst, and the European VLBI 
Network (EVN) from day 20 (O'Brien et al 2006b). The size of 
the source at 6 cm wavelength grew from approximately 20 to 35 
mas between these observations. The flux density was no longer 
varying rapidly by that time, but the earliest observations with 
MERLIN on day 4 after the outburst show it then varying by a 
factor of 2 over less than a day. Model calculations indicate that 
RS Oph would have been resolvable by the EVN at that time. 

X-ray binaries form an interesting class of galactic IDV 
sources. Several of these are associated with resolved structure 
and have emitted jansky-strength radio flares which show pro- 
nounced evolution in intensity on intraday time scales. Examples 
include Circinus X-1 (Haynes et al|T978, Tudose et al 2008]), 
C ygnus X-3 (Tudose et al '20091 and GRS 1915+105 (Fender et 
al [19991 Rushton et al 2010). 

Extragalactic IDV sources fall into fewer categories and are 
in fact dominated by flat-spectrum AGNs, or objects nowadays 
collectively described as blazars (Wagner & Witzel 1995 Lovell 
et al 2008). One could almost say that being a blazar is both a 
necessary and a sufficient condition for an extragalactic source to 
exhibit IDV, since these objects have the combination of bright- 
ness and small size which allows them on the one hand to be 
detectable at great distances and on the other to be compact 
enough to exhibit IDV. In many of these objects the compact 
core appears to be embedded in radio structure which is resolv- 
able by current interferometers (see for example Kovalev et al 
12005 ^ Ojha et al'2006^. 

IDV among blazars is an area of current interest. IDV mecha- 
nisms for these sources fall into two categories: variation which 
is intrinsic to the object, and variation caused by the propaga- 
tion medium (scintillation in the local interstellar medium, or 
ISS). Both appear to play some role but the relative proportions 
are not clear Spectral dependence of the variability at cm wave- 
lengths, annual modulation of the time scales and time shifts be- 
tween measurements made at different locations all offer strong 
evidence for the dominance of ISS for some sources (Bignall et 
al 2009 and references within). Correlation between variation at 
radio and other wavelengths (Quirrenbach et al 1991 ) or a bad fit 
to ISS spectral models (Fuhrmann et al l2008l) tend to support the 
intrinsic origin of IDV in others. Intrinsic origin presents theo- 
retical challenges because it is difficult to explain it without in- 
voking either an incidence of extreme Doppler factors which is 
hard to accept on statistical grounds, or brightness temperatures 
in excess of the inverse-Compton limit of about 10'^ K (Jauncey 
et al 1200 I V Some questions remain also in the ISS picture, such 
as a relatively weak correlation with galactic latitude (Kedziora- 
Chudczer et al 2001 Lovell et al 2008); the connection between 
scintillation and scatter-broadening is also still unclear (Ojha et 
al l2006l Lazio et al l2008b . 

The only non-blazar category of extragalactic IDV source 
which is relevant to the present paper is the maser. McCallum et 
al (J2009) for example show that the H2O megamasers in Circinus 
are both resolvable and exhibit IDV. The techniques described in 



the present paper can make it easier to image IDV masers which 
have superimposed lines, meaning they must both be present in 
the same channel map. 

It can thus be seen that there are many objects which it is 
desirable to image at radio wavelengths with as high resolution 
as possible, but whose variation on intraday timescales inter- 
feres with imaging via earth-rotation aperture synthesis. Broadly 
speaking, the problem occurs because the IDV causes the pat- 
tern of lobes around the time-variable component of the source 
to be poorly matched to the dirty beam. Such structure cannot 
be completely removed by the standard CLEAN algorithm, and 
will therefore tend to limit the dynamic range of the image. As 
further described in section l272l the situation is analogous to the 
problem in multi-frequency synthesis (MFS), caused by differ- 
ences in spectral index among objects in the field. The argument 
for seeking an improved way to image IDV sources is the same 
as for this MFS case. In fact, as is shown in the present paper, 
the same technique, originally developed for MFS by Sault and 
Wieringa d 19941 ). can be applied to both situations. 

The plan of the paper is as follows. Section |2] contains a 
brief outline of aperture synthesis theory. In sections 13.11 and 
13.21 the generalized A^-beam Sault- Wieringa theory is described 
in detail. The desirability of orthogonalizing the beams is dis- 
cussed in section 13.31 Different time and frequency basis func- 
tions are compared in section 14711 with emphasis on the avoid- 
ance of Gibbs-like phenomena. In section lT.2.21 a dual expansion 
in both frequency and time axes is shown to be both necessary 
and effective in cleaning a simulated wide-band data set which 
includes sources which vary significantly both in frequency and 
time. Finally, in section l4.2.3l a simple 2-beam expansion is de- 
rived which is shown to be a powerful means of increasing dy- 
namic range in many IDV cases. 

Brief descriptions of the methods described here have al- 
ready been presented elsewhere in earlier stages of development 
(Stewart|2008|and|2010]). 



2. Interferometry 

2.1. Aperture synthesis 

A radio interferometer measures cross-correlations between the 
voltage signals detected by pairs of antennas. Each correlation 
yields a complex number which encodes information about the 
amplitude and phase of the signals from all sources of radio 
waves in the field of view of the antennas. Each complex corre- 
lation can be considered to be a point sample, plus added noise, 
of a continuous function V(u), known as the visibility function. 
The vector u, known as a baseline, is the separation vector be- 
tween any given pair of antennas, expressed as a number of 
wavelengths. For an array of antennas which is physically copla- 
nar, V becomes a 2-dimensional function, and can be shown to 
be the Fourier transform of a function /(r) which is related to 
the sky brightness distribution at the detector wavelength. Here 
r = (I, m) are the sines of the zenith angles of a given sky loca- 
tion. The relation between / and the true sky brightness distribu- 
tion is given by 



/(r)=A(r) 



/ti-ue(r) 

vr 



(1) 



r ■ r 



where A, known as the primary beam, is the receptivity of the 
individual antenna elements as a function of r. 

If delays are introduced into the signal chains such that sig- 
nals from a single point in the sky (known as the phase centre) 
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arrive at the correlator exactly in phase, then the Fourier relation- 
ship to the sky brightness distribution becomes approximately 
true even for non-coplanar arrays, provided the portion of sky 
to be imaged is restricted to a sufficiently small region around 
the phase centre. In this general, non-coplanar case, the baseline 
vectors u are considered to be the separation vectors between the 
pairs of antennas projected onto a plane normal to the phase cen- 
tre. The direction cosines / and m are also in this case taken in 
a basis in which (/, ni) - (0, 0) lies in the direction of the phase 
centre, rather than the zenith. 

If the sampling function (which is, to first approximation, a 
sum of delta functions) is denoted by S then the output of the 
interferometer is VS . The Fourier inversion 'S^'^iVS) gives D - 
I * B where D, known as the dirty image, is a convolution of the 
true sky image / with the 'dirty beam' B - 3"' (5). B plays the 
same role for an interferometer as the point-spread function of a 
traditional telescope. 

In general, the larger the number of samples of V, the more 
compact the distribution of flux density in B, and therefore the 
closer the correspondence between D and the true sky /; the sen- 
sitivity of the observation will also be increased. An interfer- 
ometer containing A^ antennas will generate N{N - l)/2 inde- 
pendent samples of V at each observation. The number of sam- 
ples can be further expanded via the technique of Earth-rotation 
synthesis. In this, the rotation of the Earth over the course of a 
day is used to generate a sequence of different baselines u for 
each antenna pair. Another way to increase the number of sam- 
ples is to observe at several frequencies, a technique known as 
frequency synthesis. Because baselines are expressed in wave- 
lengths, the fixed spatial separation between a pair of antennas 
generates baseline vectors u of different length at diff'erent fre- 
quencies. Implicit in these two synthesis techniques is the as- 
sumption that the sky brightness will be constant over time in 
the first case and over frequency in the second. 

A more detailed description of the fundamentals of interfer- 
ometry can be found in many sources, for example Thompson, 
Moran & Swenson (.200 Ij or Taylor, CariUi & Perley (fT999] l. 

2.2. CLEAN 



The CLEAN algorithm was invented by Hogbom ( I1974I I and has 
been further elaborated by Clark (1980") and Cotton and Schwab 
(Schwab 1984) among others. The essence of the algorithm is to 
perform many iterations of a process in which a small amount 
of the dirty beam is subtracted, centred at the highest remaining 
point in the dirty image, ideally until nothing remains in this im- 
age but noise. The positions and amounts subtracted are recorded 
as 'clean components' and used afterwards to reconstruct an ap- 
proximation to the true sky image /. The gain factor by which 
the dirty beam is multiplied, and the number of iterations to per- 
form, are parameters which are chosen ahead of time by the user. 

CLEAN was originally presented as an empirical algorithm 
which appeared to produce results, although it required some 
experience to judge how best to apply it. It is known not to per- 
form well when applied to extended objects (Cornwell l 19831 1. al- 
though a modified algorithm has been shown to yield improve- 
ments here (Steer, Dewdney & Itoh 1984). CLEAN is still in 
wide use as a practical method of removing sampling artifacts 
from interferometry images. 

The reasons for CLEAN's success were unclear for some 
time (Corn well et al 1999), and its theoretical basis has come 
under occasional criticism (Tan 1 19861 Lannes et al l 1 997) . Only 
relatively recently has it been shown to be related to compres- 



sive sampling (Candes & Wakin l2008l l and been given a sound 
theoretical underpinning. 

In order for Hogbom CLEAN to work, the convolution re- 
lation D - I * B must be valid. There are cases where this as- 
sumption breaks down however. One of the most severe depar- 
tures occurs in frequency synthesis with large fractional band- 
widths. It is common for a single observational field to contain 
objects whose spectral indices differ by several tens of percent 
or more. Differences in spectra of this order are unimportant if 
the observation fractional bandwidth is small but may signifi- 
cantly degrade the convolution assumption where the fractional 
bandwidth approaches 1 . 

Conway et al Cl990j showed that, in the wide-band case in 
which the dirty image D no longer well approximates a convo- 
lution of the sky brightness distribution /, it was nevertheless 
possible to express D as a sum over a relatively small number 
A^ of component images D,, each of which individually obeys a 
convolution relation D,- - /, * Z?,. Conway et al arrived at this by 
expanding the nominal spectrum at each sky location in a Taylor 
series, each /th term of the series generating a respective 'spec- 
tral dirty beam' B,-. In this case each image 7, is simply a sky map 
of the value of the /th Taylor coefficient. Conway et al suggested 
a coordinate transform for better application of this technique 
to commonly-found power-law radio spectra, and presented an 
approximate method to solve for the coefficient images 7, when 
A^ is restricted to 2. This method consists of a 2-step sequential 
CLEAN and relies on the beams Bo and B\ being approximately 
orthogonal. 

Sault and Wieringa (|1994| l retained the Taylor expansion but 
devised a new solution algorithm which can be thought of as a 
generalization of the Hogbom CLEAN algorithm from its orig- 
inal 'scalar' context, in which a single image D is iteratively 
deconvolved, to a new 'vector' context in which A^ images Dj 
are deconvolved in parallel. Orthogonality of the beams is no 
longer required (although the technique fails if 2 or more beams 
are identical). Sault and Wieringa elaborated their theory as it 
applied to the N -2 case but, as the authors themselves suggest, 
the extension to A^ > 2 is not difficult. 

The CLEAN algorithm continues to be a subject of active 
development. Recent work includes an extension of CLEAN 
to produce clean components with a range of sizes (Cornwell 
2008), a modification to clean tomographic LISA images 
(Hayama et al 2006), and an adaption of the algorithm to re- 
construct RHESSI images of solar flares (Schwartz 2009 ). The 
compressive-sampling formalism has recently also generated 
some promising new approaches (Wiaux et al |2009l Li, Cornwell 
& de Hoog|20n]l. 



3. A generic multiple-beam CLEAN 

3.1. Generalized Conway decomposition 

In its most general form, the aim of the treatment described by 
Conway et al (1990) is to find a way to decompose the dirty 
image into a sum of convolutions 



D 



Z^'-Z^'*^' 



(2) 



(=0 



/■=() 



The problem is to calculate a set of beams B, for which this rela- 
tion is true. There are no doubt many ways to do this, but in the 
present section we are going to consider only the route which 
comes from expanding the frequency and time dependence of 
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the sky brightness in a sum of basis functions. Such an expan- 
sion is written 



p-i Q-i 



I(r, y,t)- l^l^ Ip,gir) Fp(v) Tg(t), 

p=() q=0 



(3) 



where /(r, v, t) is the brightness function; r = (/, m) are direction 
cosines of the angular displacement from the phase centre; v is 
frequency and t time; Ip,q{r) is a sky map of the (p, q)th com- 
ponent; and Fp and T^j are pth and ^th members respectively of 
sets of basis functions. Earth-rotation and frequency synthesis 
will return a set of j - 1 to M measurements Vj of the cross- 
correlations between antennas. Ignoring for the sake of simplic- 
ity non-coplanarity, weights, the shape of the primary beam, and 
the spherical projection correction (1 - r ■ r)""^ (see equationlTJ, 
each Vj is related to / by a transform: 

Vj - I dr exp(-2m u ■ r) /(r, Vj, tj). 

Although each Vj represents an average over a small time 
X bandwidth window, if this window is small enough, Vj may 
be represented as a point sample of the visibility function at the 
spatial frequency Uj. The total set V of visibility measurements 
can thus be written 

M „ 

V = V (J(u - Uj, V - Vj, t - tj) I dr exp(-2;r/ u ■ r) /(r, v, 0- 

Expanding / in its basis functions gives, with some rearrange- 
ment; 

M „ 

V = 2 2 (5(u - u,) Fp(vj) Tgitj) I dr exp(-2m u ■ r) Ipjr). 

Fourier inversion of V yields the dirty image D. This is indeed 
found to be a sum of convolutions; 



(4) 



where 



r M 



B 



pq 



Y,S{u-Uj)Fp{yj)T,{tj) 



yj=i 



Here the 3 symbol represents the Fourier transform. The two 
sums over p and q are easily collapsed to one, in which case 
equation|4]maps directly to equation|2l Clearly this requires that 



3.2. The N-beam Sault-Wieringa algorithm 

Sault and Wieringa ('1994') developed an algorithm which, al- 
though it was set down only for N = 2, is easily generalized to 
the following; 

- Generate initial Rj from D • Z?, (the • here represents corre- 
lation). 

- Generate, for / e [0 ; A^ - 1] and j e [0 : N - I], cross- 
correlation images Z,j = B,- • Bj. 

- Calculate a matrix M such that its (/, j)th element is given by 
Zij(r = 0). 

- Perform a number of iterations of the following procedure; 



1 . Find the location r,nax of the maximum value of the im- 
age formed from R - Y^ijRiRj^n- 

2. Perform, for each /, the subtractions 

N-\ 

Ri(r) = Ri{r) - ^ ^ Rj{r^.,^)Zij(r - r^,,). 

3. Form a vector Craw of the entangled clean components 

'^^/■(rmax)- 

4. Solve the equation c = MCraw to disentangle the clean 
components. 

5. Store these for later cleaned-image reconstruction. 

3.3. Orthogonality pros and cons 

The Sault-Wieringa deconvolution includes a matrix inversion. 
If two or more beams from the set of N are identical, the ma- 
trix M is singular, i.e. uninvertible. If no two beams are exactly 
identical, but two or more are similar (have a normalized scalar 
product close to unity), M will be formally invertible but can be 
expected to be ill-conditioned (i.e., result in large errors in the 
final clean components). Hence approximate orthogonality be- 
tween beams is desirable; but if this be established, little further 
reduction in error is to be gained by enforcing complete orthog- 
onality. 

There may however be other reasons to manipulate the 
beams, even to the point of transforming them into an orthonor- 
mal set, other than keeping the matrix M well-conditioned. Two 
such circumstances are described here. Firstly, the main aim of 
the Conway decomposition and subsequent vector cleaning is ar- 
guably to obtain a better image of the average sky brightness dis- 
tribution. This image is identical to the image of the zero-order 
beam coefficient if and only if every other beam is zero-valued 
at its centre - because a non-zero value of B,(0) equates to a 
non-zero average of the /th basis function over the frequency 
and time range of the observation. Thus in order to simplify 
the post-production of an average-brightness image, one would 
want to modify the set of beams before cleaning by subtracting 
B,(0) X Bo/Bq{Q) from each B, for / > 0. 

The second circumstance concerns the practicalities of com- 
puting the deconvolution. The Sault-Wieringa algorithm requires 
that N{N +1) images be kept in memory during the iteration. 
For A = 2, as treated by Sault and Wieringa, this is a man- 
ageable number of images. However, some of the scenarios dis- 
cussed in the present paper make use of values of A as high as 
30. Maintenance within memory of close to 1000 arrays, each 
of which may be as large as 2048 x 2048 pixels, is likely to 
prove a strain for most machines at least in the lower echelons 
of present-day computing power. The load may be much reduced 
by orthogonalizing the beams before the start of the algorithm, 
i.e. by generating a set of modified beams B'. such that B'. ■ B'. - 
for all / i^ j. The cross-correlation images Z'. generated from 
these will be such that their central values, thus the elements of 
M', will be for all / t^ j. There is no need to store or even 
generate the 'off-diagonal' images; the necessary inventory of 
images thus reduces to 2A. 

The transformation from the basis functions in equation |3] 
to beams in 'sky-image' space does not preserve orthogonality. 
Because of this there seems little to be gained by requiring the 
basis functions to be orthogonal; one must work directly with 
the dirty beams themselves. 

The Gram-Schmidt process of converting B, to orthonormal 
B'. can be expressed as the matrix equation 



b = Gb', 



(5) 
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where b is a vector of the A^ input dirty beams B,, b' is a vector 
of the orthonormalized beams B'., and G is a lower-triangular 
matrix with elements defined as follows: G,_, = B'. ■ B'. before BJ 
is normalized, and Gtj for j < i = B'. ■ B,-. 

As Golub & Van Loan d 19961 1 point out, the usual Gram- 
Schmidt procedure is known to be numerically unstable. These 
authors suggest an alternate, stabilized algorithm which amounts 
to solving equation |5]by backward substitution, the elements of 
G being calculated on the fly. 

The Sault-Wieringa algorithm, working with orthonormal- 
ized beams, produces, for each image pixel (x,y), a vector of 
clean components c' such that 

or, in vector notation, 

D = c'^b'. (6) 

How to reconstruct the clean components which give D in terms 
of the original beams? Inverting equation|5]and inserting|6]gives 

D = c'^G 'b. 

Obviously the desired clean components c are given by 

c^^c'^G-'. 

3.4. Post-processing 

The output of the Sault-Wieringa process is, as with Hogbom 
CLEAN, a list of clean components; but here each compo- 
nent is no longer scalar but is itself an A^-element vector or 
list. The question then arises of what to do with this infor- 
mation. Almost certainly one will want to make an image of 
the average sky brightness, since one of the motivations of the 
Conway decomposition has been to obtain a more accurate re- 
construction of the sky, free from the spectral artifacts which 
one sees in Hogbom-based deconvolution of wide-band observa- 
tions. One might think that similar images should also be made 
from each of the higher-order elements of the clean components. 
Experience shows however that such images rapidly become too 
noisy to be of use as the order increases. In broad terms this 
is because the higher-order basis functions tend to vary more 
rapidly with displacement in the UV plane; which means they 
contain more power in higher spatial frequencies in this domain; 
which translates to a broader, less centrally peaked structure for 
the respective beam in the sky domain. This broadening and di- 
luting of the beam profile for higher orders (visible in the exam- 
ple beam profiles of Conway et al l 19901 ) causes relatively more 
'false detections' at these orders to occur during the deconvo- 
lution, until at the end the real clean component values for this 
order may become lost among the false ones. The problem is 
exacerbated by the tendency of the true component values to de- 
crease (with any sensible choice of basis function) as the order 
increases. 

A possible remedy for this problem might be to choose 
smaller values of the loop gain A as the order of the basis func- 
tion increases. But we have not so far tested this conjecture. 

For constructing a restored image of the average brightness, 
it seems clear that the best thing to do is construct an average- 
brightness beam, and fit the restoring beam to its centre, in anal- 
ogy to the present standard practice when employing Hogbom 
CLEAN. However, there is no such obvious source of a restor- 
ing beam profile for higher-order images, since their respective 



beams don't necessarily have a peak in the centre to fit to. Use 
of the average-derived restoring beam in these cases is probably 
acceptable though, since the underlying spatial resolution should 
be similar. Another possibility would be to use a Gaussian of the 
same full width half maximum as the beam. 

4. Time-variable sources 

4.1. Expansion in basis functions 

It is a commonplace that radio continuum spectra tend to be 
smooth. In nearly all cases a continuum spectrum may be well 
approximated over an octave or so (a frequency range which 
the forthcoming wide-band interferometers are not expected to 
greatly exceed) by a polynomial of at most 2 or 3 orders. In this 
case there seems little to be gained by investigating the use of 
other types of basis function. Light curves, on the other hand, 
are much less well behaved: fluctuations can be expected over a 
wide range of timescales. Sault-Wieringa style parallel cleaning 
of observations which include time-variable sources may there- 
fore require expansion of the nominal light curve to higher order 
than is usually necessary for spectra. This also makes it of more 
interest to compare different basis expansions in this case. 

Note however that an observation which includes only a sin- 
gle variable source can be cleaned via the simple 2-beam tech- 
nique described in section 14.2.3 1 

All other aspects being equal, one would choose the basis set 
which resulted in the smallest residuals for a given A^ in equation 
|2] which is another way of saying that the sum over A^ 'compo- 
nent' dirty images D, should converge rapidly to the 'true' dirty 
image D as N increases. And rapid convergence in the sky plane 
is surely linked to rapid convergence of the basis function expan- 
sion in the U-V plane. However, it is difficult to predict which 
set will best meet this criterion, for two reasons. Firstly the light 
curves encountered in practice may be very diverse: a basis set 
which converges rapidly to one sort may not do so for another. 
Secondly, there appears to be at present no analysis which pre- 
dicts the image residuals. Thus trial and eiTor, as with traditional 
CLEAN, probably remains the best guide here too. 

In the present section we compare two sets of basis functions 
for the expansion of light curves, namely Fourier sinusoids and 
Chebyshev polynomials. 

If a standard Fourier series is employed, the expansion will 
suffer from Gibbs phenomenon at the boundaries, unless the 
light curve is periodic continuous (in fact periodic analytic) over 
the chosen observation interval. Gibbs ringing can however be 
much reduced by the following treatment. Let the duration of 
the observation be denoted by T, and the light curve at a given 
sky direction by /(f). Suppose one doubled the observation time, 
and filled the new interval with a function g defined such that 



g(t)^ 



( fit), 0<t<T 
\f(T-t),T<t< 2T. 



The new function g is periodic continuous in the interval [0, 2T], 
so there should be no zeroth-order Gibbs ringing in its Fourier 
expansion, g is also symmetric about zero, so only cosine terms 
will remain in the expansion. If the resulting Fourier basis func- 
tions are truncated back to [0, T] they are seen to be given by 



T„it) = cos(7Tnt/T) for n e [0, ..A^]. 



(7) 



Similar tricks can be used to enforce boundary dilTerentiability 
to higher orders if desired. 

If there are gaps in the time sequence of data values - patches 
of bad data perhaps, or even planned, periodic observations of a 
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phase calibrator - such gaps might be expected to cause addi- 
tional Gibbs-type problems. A normal Fourier expansion of a 
function which suddenly drops to zero at intervals would suffer 
Gibbs ringing at the discontinuities. However, Conway decom- 
position is not 'normal expansion' in the basis functions. The 
actual basis onto which the dirty image is projected consists of 
the set of beams; and the construction of each beam from the 
corresponding basis function in the UV plane effectively discon- 
nects it from any irregularities in the support of the basis func- 
tions. Orthonormal basis functions won't necessarily produce or- 
thonormal beams, and vice versa. 

A final point to note is that this technique cannot disentangle 
or unwrap an observation which spans more than 24 hours at the 
same frequency: in this case all that can be obtained is the net 
(cyclic superimposed) light curve. 

4.2. Simulations 

4.2.1 . A single time-variable source 

Figures [1] to |4] refer to a simulation constructed as follows. 
Artificial visibilities were calculated for a single time-varying 
source which was located at the phase centre. The array param- 
eters were those of MERLIN; the chosen source declination was 
H-40°; integration time was 10s, and 32 channels of 1 MHz width 
starting at 6 GHz were used. Perfect calibration was assumed 
and no instrumental noise was added. A gap was introduced into 
the data sequence between approximately 4 and 6.5 hours post- 
meridian. 

Figures [T] |2] and [3] show the effect of different basis expan- 
sions on the Gibbs phenomenon; figure |4] concerns the relative 
efficiency of Hogbom versus Sault-Wieringa cleans for time- 
variable sources. 

In figure [T] the light curve of the source (solid line) is com- 
pared firstly to a direct Fourier expansion of this light curve to 
order 10 (dashed line), and secondly to the light curve recon- 
structed after Conway decomposition into 6 beams (i.e. to order 
5) from the half-frequency cosine basis functions prescribed by 
equation|7l followed by 1000 cycles of Sault-Wieringa cleaning 
with loop gain 0.1 (dotted line). The Gibbs ringing of the stan- 
dard Fourier expansion at boundaries and other discontinuities is 
very obvious, as is the almost complete absence of such an effect 
in the Sault-Wieringa result. 

Figure [1] shows clearly that Gibbs phenomenon can be 
largely avoided in Sault-Wieringa cleaning of time-varying 
sources. It is of interest however to explore a little more deeply 
into the difference between boundary discontinuities and data 
gaps, and the relative efficacy of different basis functions. 
Figures |2] and [3] serve this purpose. For these figures, the same 
input data are used, but the plotting style is different to figure [1] 
Here are plotted, on a logarithmic scale, absolute values of the 
residuals of the curves of interest against the input light curve. 

In figure|2]the direct Fourier expansion (dashed line) is com- 
pared to the result of a Sault-Wieringa clean (half-tone solid line) 
as in figure [T] but here the unmodified Fourier functions, i.e. 
the same functions used in the direct expansion, were chosen as 
the basis set (which generated 21 beams!). As expected, in both 
cases there is Gibbs oscillation at the boundary, but only in the 
direct expansion is there also Gibbs at the edges of the data gap. 
The conclusion here is that it is unnecessary to choose the basis 
set with any care in order to avoid Gibbs ringing at data gaps: 
the Sault-Wieringa technique by its nature avoids such troubles. 
The same is however obviously not true of the problem at the 
boundaries, which arises from the inability of the Fourier basis 
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Fig. 1. Direct Fourier expansion of a simulated light curve com- 
pared to the result of a Sault-Wieringa clean using a set of basis 
functions which suppresses Gibbs oscillation at the boundaries. 
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Fig. 2. The same light curve as in figure[T]but now with residuals 
plotted. Shown are the residuals from the Fourier expansion of 
figure [1] (dashed curve) and the residuals from a Sault-Wieringa 
clean using the same set of Fourier basis functions (half-tone 
solid curve). For better clarity, values of small magnitude have 
not been plotted; and a different cutoff was used for each curve. 
Points falling within the data gap have also been omitted. 



set to represent a function with a discontinuity at the boundary. 
In effect, the Sault-Wieringa deconvolution can interpolate over 
a gap, but cannot (without 'outside help') deal with a disconti- 
nuity. 

Figure |3] again shows residuals from the direct Fourier ex- 
pansion (dashed line) but compares them here to residuals from 
two different Sault-Wieringa deconvolutions: using firstly the 
half-frequency cosine basis of equation|7](dotted line, as already 
displayed in figure [TJ, and secondly, Chebyshev polynomials 
(dot-dashed line). Both expansions were truncated at order 5. 
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Fig. 3. This is similar to figure |2l The dashed curve again rep- 
resents the Fourier residuals, whereas the dash-dot and dotted 
curves are residuals from a Sault-Wieringa clean using two dif- 
ferent types of basis function. For the dash-dot curve, Chebyshev 
polynomials up to order 5 were employed to generate the Sault- 
Wieringa beams; the dotted curve is the result of using the half- 
frequency cosine functions (also to the order 5). 



It is perhaps slightly unexpected that the cosine functions seem 
to fit the light curve better than the Chebyshevs. 

Figure |4] compares a Hogbom clean (1000 cycles at gain 
0.1) of the same time-varying single source with Sault-Wieringa 
clean using the half- frequency cosine basis to order 5. (The 
Sault-Wieringa result is the same which generated the dotted 
curves in figures [1] and |3]) The data plotted in figure |4] come 
from images: the dirty image for the data set, and the clean and 
residual images from the Hogbom and Sault-Wieringa cleans re- 
spectively. What has been done for each image is to calculate the 
RMS of the image values in a polar coordinate system centred 
on the source at the phase centre. The RMS values have been 
plotted as a function of radius from this centre. 

Use of the Sault-Wieringa procedure is seen to improve the 
dynamic range by about 2 orders of magnitude to almost 10'*. 

4.2.2. Several sources varying in frequency and time 

The simulation described in the present section was devised 
to be an exacting test of the Conway /Sault-Wieringa technique 
as applied to sources which vary both in frequency and time. 
The input data were derived from a model containing five point 
sources, each of which had a power-law spectrum. The first 
source was to serve as a reference, and so had a flat spectrum 
with no time variation. For the remaining sources, both the av- 
erage flux density and the spectral index were made to vary 
over time. The frequency and time dependences of the remaining 
sources are illustrated in figure|5] 

The array parameters for this simulation were again those 
of MERLIN, with a phase centre declination of +40°; however 
the bandwidth chosen this time spanned from 5 to 7 GHz, di- 
vided into 50 channels of width 40 MHz. The integration time 
was set to 60 sec. These chosen values for channel width and 
integration time are much coarser than those usually associated 
with MERLIN, and would set severe limits to the breadth of field 
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Fig. 4. Both sides of this plot show RMS image values as a 
function of radius from the source position. Values calculated 
from clean-component images appear in the left panel; those on 
the right come from residual images. The radial distribution of 
RMS values in the dirty image (solid line) is included in both 
panels for purposes of comparison. The dashed lines represent 
the Hogbom-cleaned data, whereas the dotted lines are from the 
Sault-Wieringa cleaning. 




Fig. 5. Surfaces showing the frequency and time dependence of 
the 4 variable sources. The height of each surface represents the 
flux density of that source. Major contours occur at 1 jansky in- 
tervals. The non-variable source had a flat spectrum at 1 Jy. Note 
that the order of the sources has been slightly rearranged for bet- 
ter display of the surfaces. 



in a real observation. In the present case, the field of interest is 
naiTow, so coarse values were chosen in order to reduce the com- 
puting load. The chosen values were the largest ones consistent 
with accurate gridding of the relatively small area of the field 
occupied by the simulated sources. 

The simulated visibilities were gridded with uniform weight- 
ing and subjected separately to 2000 cycles of Hogbom cleaning 
at a loop gain of 0. 1 versus the same number of cycles, at the 
same gain, of Sault-Wieringa cleaning, using FT basis functions 
as described in section 13.11 the F functions being Chebyshev 
polynomials (although to the small order used, these are scarcely 
distinguishable from Taylor series terms) and the T functions be- 
ing the half-frequency Fourier cosine functions of equation]?] 
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Fig. 6. Figure a is an image constructed from the average flux 
densities of the input sources to the simulation, b is the dirty im- 
age, c is the result of 2000 cycles of Hogbom clean at a loop gain 
of 0.1. d is the result of a Sault-Wieringa clean (with the same 
niter and loop gain) as described in the text. The unsaturated 
brightness range for all four images was -0.02 to H-0.02 Jy/beam. 
(For comparison purposes, the brightest pixel of the source im- 
age was 1.27 Jy/beam.) The grey scale is reversed. Black rect- 
angles indicate the area chosen for the RMS measurement as 
described in the text. 

Table 1. RMS values from cleaned images. 



Image name RMS (Jy) 

Dirty image 1.588e-2 

Hogbom cleaned 4.847e-3 

SW cleaned 5.415e-4 



A number of images relating to this simulation are shown in 
figure |6] 

An RMS value was calculated for those pixels within a rect- 
angular area which is shown on the plots by a black outline. 
These numbers are shown in table [1] The Sault-Wieringa tech- 
nique clearly offers a significant improvement in dynamic range. 

As discussed in section [3741 the value of this technique lies 
more in its ability to remove artifacts from an image of aver- 
age flux density, rather than as a way to estimate light curves or 
spectra. For wide-band observations in which there is no time 
variation over the observation, because of the typically slowly- 
varying nature of radio spectra, one may reasonably expect to 
extract low-order spectral information (such as spectral indices) 
from Conway decomposition. But because light curves may eas- 
ily contain significant power at higher orders of the time ba- 
sis functions, which are not so accurately recovered by the de- 
convolution, more caution is advisable in interpreting the time- 
dependent output of the deconvolution process. If an accurate 
light curve of a source is desired, it is probably always going to 
be preferable to just phase-rotate the array to that sky location. 

4.2.3. Simpler method for a single variable source 

A source which shows significant time variation within the 
course of a day must have a spatial dimension less than about 
a light-day across. When observed with, for example, the 
MERLIN array at 21 cm, any such source more distant than 
about a kiloparsec will be unresolved. Except in the case of 



Fig. 7. Figure a is an image constructed from the average flux 
densities of the input sources to the simulation, b is the dirty 
image, c is the result of 5000 cycles of Hogbom clean at a gain 
of 0.01. d shows the distribution of the zeroth component of a 
Sault-Wieringa clean, with similar gain and number of clean cy- 
cles, using the two-beam decomposition as described in the text. 
The unsaturated brightness scale was -0.005 to -1-0.005 for a and 
d, -0.05 to H-0.05 for b and c. The grey scale is reversed for all 
images. 



masers, it is also unusual to find two such variable sources in 
close proximity. For many observations of time-variable radio 
sources, therefore, we may expect the source to be point-like 
(unresolved) and the only source in the field which shows a sig- 
nificant time variation. In this case the light curve for the whole 
observed field, itotai, is just a simple sum of the source light curve 
■Ssource plus a timc-invariant background flux density. Any point 
in the image can therefore be expressed as a sum of just two 
basis functions: 

ro(0 = 1 

7^1(0 - Stotal -<itotal)- 

Here the mean notation () is used to indicate that beam 1 is con- 
structed as follows. A raw beam 1 is first formed by gridding, 
weighting and transforming the visibilities from a source at the 
phase centre which has a light curve given by >? total- An amount 
of beam is then subtracted from it such that the central value 
of the result is zero. Beam Bi is adjusted in this way so that 
the average flux-density information over the field is contained 
entirely in the distribution of component 0. 

A further simulation was constructed to test this technique. 
This simulation contained a time-variable point source located at 
the phase centre together with much fainter, extended emission 
(actually made up of 24 closely-spaced point sources) extending 
over about 0.2 arcsec (equal to 20 image pixels) either side of the 
central source. The average flux density of the central source was 
1 Jy/beam whereas the extended emission ranged in brightness 
from about 3.5 x lO"-' to 1.5 x lO"-' Jy/beam. The light curve 
of the central source was the same as diagrammed in figure [l] 
but without the data gap. It brightens by 3.7 magnitudes in the 
course of the observation. 

A quartet of images (similar to figure |6]l to exhibit the per- 
formance of this technique is shown in figure|2l For constructing 
the visibilities, MERLIN specifications were again used. The in- 
tegration time was 5 seconds, and 32 channels of width 1 MHz, 
starting at 6 GHz, were specified. 
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It is easily seen that the Sault-Wieringa deconvolution recov- 
ers almost all the faint emission. The Hogbom clean is unable to 
remove sidelobes at a level 10 times the flux density of the ex- 
tended emission and thus is incapable of revealing it. Several 
values of the Hogbom gain and number of iterations were tried 
with no improvement on what is displayed here. 

5. Conclusions 

In this paper, a technique developed originally by Conway et al 
(1199 0) and Sault and Wieringa ( 1994) to allow cleaning of multi- 
frequency-synthesis images has been generalized and shown to 
be applicable to earth-rotation-synthesis observations in which 
some of the sources vary significantly in brightness over the 
course of the observation. Sources which vary over the course 
of a day are not very common, but they do occur, and are some- 
times (in the case for example of novae) sources which it is of 
the highest interest to map accurately. But in fact one does not 
have to look for natural variations in flux density to encounter 
this problem: any movement of the primary beam of the array 
on the sky during an observation will generate artificial fluctua- 
tions, not only in the average flux density of sources, but, due to 
the frequency-dependent size of the primary beam, also in their 
spectral indices. Since some kind of pointing eiTor in a mechan- 
ically tracking dish is probably unavoidable, this is likely to be 
a problematic issue when performing any kind of wide-field or 
mosaiced observation (see e.g. Bhatnagar et al 120081) . particu- 
larly so in view of cuiTent hopes for improvements in dynamic 
range from the several wide-band, dish-antenna arrays, such as 
e VLA, eMERLIN and MeerKAT, which are cuiTently under con- 
struction. 

Deconvolution of time-varying sources reveals some issues 
which are not usually encountered in the frequency-synthesis 
case. The principal one of these is that choice of basis function 
is now of some importance. This issue was explored with some 
care, in particular the avoidance of Gibbs ringing when periodic 
basis functions are employed. It was also shown that, provided 
the time variation in the field is limited to a single, unresolved 
source, a particularly simple 2-beam technique can produce an 
almost perfect deconvolution. 

Whereas the original parallel decomposition treatment em- 
ployed at most 3 or 4 beams, some of the situations described 
in the present paper required as many as 30. Use of such large 
numbers of beams gives rise to a computational difficulty due 
to the requirement in the original Sault-Wieringa algorithm to 
store, for A^ input beams, of order A^^ cross-correlation images. 
It is shown here that this memory load can be much reduced if 
the set of beams is made orthogonal. An algorithm for doing this 
was proposed and it was shown how 'de-orthogonalized' clean 
components can be recovered at the end of the cleaning process 
via a simple matrix inversion. 

Acknowledgements. We thank the anonymous referee for some sound advice 
and for directing our attention to the compressive sensing papers. 
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